Programming Massively Parallel Processors: A Hands-on Approach: The Birth of GPU Computing

The birth of the GPU was a radical break driven by the "real-time imperative": the non-negotiable requirement to render complex 3D scenes within a window of $1/60$ of a second (16.67 ms). CPUs followed a multicore trajectory optimized for low-latency serial execution, and they hit a wall as display resolutions increased.

1. The 16.67 ms Constraint

In the mid-1990s, gaming reached a crisis. A serial CPU, already handling AI and physics, could not compute millions of pixel values quickly enough to keep motion fluid. This forced the creation of dedicated hardware to offload the repetitive graphics pipeline.
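The arithmetic behind the frame budget can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the 640x480 resolution is an assumption chosen for the example, not a figure from the lesson.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


def per_pixel_budget_ns(fps: float, width: int, height: int) -> float:
    """Average time per pixel, in nanoseconds, if pixels are computed one at a time."""
    frame_ns = 1e9 / fps
    return frame_ns / (width * height)


if __name__ == "__main__":
    # 60 FPS leaves 1000 / 60 ms per frame.
    print(f"{frame_budget_ms(60):.2f} ms per frame at 60 FPS")
    # Assumed resolution (640x480) for illustration only.
    print(f"{per_pixel_budget_ns(60, 640, 480):.1f} ns per pixel at 640x480")
```

Under these assumptions a strictly serial processor would have roughly 54 ns for each pixel, with that same time also shared by AI, physics, and everything else the game does.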

2. Scan-Line Interleave (SLI)

Before GPUs contained internal parallel arrays, 3dfx introduced Scan-Line Interleave (SLI). By using two physical cards to compute alternating horizontal lines, the industry shifted its focus from single-thread speed to raw, brute-force throughput.
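The idea of splitting a frame by alternating scan lines can be modeled in software. The sketch below is a toy illustration, not a description of how the 3dfx hardware was implemented: `assign_scan_lines`, `render_line`, and `render_frame_sli` are hypothetical names, and the "rendering" is a placeholder gradient.

```python
def assign_scan_lines(height: int, num_cards: int = 2) -> list[list[int]]:
    """Split scan-line indices so card k gets lines k, k + num_cards, ..."""
    return [list(range(card, height, num_cards)) for card in range(num_cards)]


def render_line(y: int, width: int) -> list[int]:
    """Hypothetical stand-in for per-line rendering: a simple gradient."""
    return [(x + y) % 256 for x in range(width)]


def render_frame_sli(width: int, height: int, num_cards: int = 2) -> list[list[int]]:
    """Build a frame by letting each card fill in only its own scan lines."""
    frame: list = [None] * height
    for lines in assign_scan_lines(height, num_cards):
        # Real cards would work simultaneously; this loop visits them in turn.
        for y in lines:
            frame[y] = render_line(y, width)
    return frame


if __name__ == "__main__":
    print(assign_scan_lines(6))  # card 0 gets even lines, card 1 gets odd lines
```

Because each scan line is computed independently of the others, the interleaved result is identical to rendering every line on a single device; the gain is that each card only has to produce half of the lines per frame.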

3. Throughput versus Latency

From its genesis, the GPU devoted silicon area to simple arithmetic units rather than to complex branch prediction. This "wide and slow" philosophy let GPUs handle the repetitive math of triangles while the CPU concentrated on non-parallel logic.
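A back-of-the-envelope model can make the trade-off concrete. Everything in the sketch below is an assumption made for illustration: the unit counts and per-unit rates are invented, `batch_time_s` is a hypothetical helper, and the model ignores memory bandwidth, scheduling, and synchronization costs.

```python
import math


def batch_time_s(n_items: int, n_units: int, items_per_sec_per_unit: float) -> float:
    """Time to finish n_items independent work items spread evenly over n_units."""
    items_on_busiest_unit = math.ceil(n_items / n_units)
    return items_on_busiest_unit / items_per_sec_per_unit


# Hypothetical numbers, chosen only for illustration.
PIXELS = 640 * 480

# One fast unit: low latency per item, limited throughput.
fast_serial = batch_time_s(PIXELS, n_units=1, items_per_sec_per_unit=10_000_000)

# Many slow units: each item takes 10x longer, but 64 run side by side.
wide_slow = batch_time_s(PIXELS, n_units=64, items_per_sec_per_unit=1_000_000)

if __name__ == "__main__":
    print(f"one fast unit:  {fast_serial * 1000:.2f} ms per frame")
    print(f"64 slow units:  {wide_slow * 1000:.2f} ms per frame")
```

With these made-up figures the single fast unit needs about 30.7 ms for the frame, while the 64 slower units finish in about 4.8 ms, even though each individual pixel takes ten times longer on a slow unit. The comparison only holds when the work items are independent, which is why non-parallel logic stays on the CPU.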

QUESTION 1

What is the specific 'time budget' required for 60 frames per second (FPS)?

33.33ms

16.67ms

10.00ms

100.00ms

QUESTION 2

How did 3dfx's SLI achieve early parallelism in consumer hardware?

By increasing the clock speed of a single chip.

By having two cards render alternating horizontal scan lines.

By sharing AI logic between the GPU and CPU.

By reducing the resolution of the frame.

QUESTION 3

Why did the GPU diverge from the standard multicore trajectory of CPUs?

GPUs needed deeper caches for complex branching.

GPUs prioritize throughput of simple math over low-latency serial logic.

CPUs became too expensive to manufacture for 3D graphics.

GPU architectures were designed to be smaller than CPUs.

QUESTION 4

In the context of 1990s gaming, what was the 'Real-Time Imperative'?

The requirement to run physics simulations on the GPU.

Processing millions of pixels within the strict frame window.

The transition from 16-bit to 32-bit computing.

Allowing the CPU to handle rasterization.

QUESTION 5

What is meant by the GPU's 'Wide and Slow' philosophy?

Using many simple processors at lower clock speeds to do massive work.

Designing physically wide chips that take longer to process data.

A design that favors high latency but high memory capacity.

Optimizing for single-threaded serial logic.